Providing deterministic frequency and voltage enhancements for a processor

ABSTRACT

Providing deterministic frequency and voltage enhancements for a processor is disclosed. In an embodiment, a microcontroller on a processor identifies a plurality of parameters related to a processor, the plurality of parameters including at least a current supplied to the processor; determines, in dependence upon the plurality of parameters, one or more frequency scaling indexes including determining an effective switching capacitance ratio; identifies, in dependence upon the one or more frequency scaling indexes, a predetermined frequency parameter for the processor; and transitions, based on the frequency parameter, the processor to a target clock frequency. In another embodiment, a microcontroller on a processor decreases, incrementally, a power supply voltage for a processor; determines that a voltage droop parameter exceeds a voltage droop parameter threshold; and increases, incrementally, the power supply voltage in response to determining that the voltage droop parameter exceeds a voltage droop parameter threshold.

BACKGROUND

The development of the EDVAC computer system of 1948 is often cited as the beginning of the computer era. Since that time, computer systems have evolved into extremely complicated devices. Today’s computers are much more sophisticated than early systems such as the EDVAC. Computer systems typically include a combination of hardware and software components, application programs, operating systems, processors, buses, memory, input/output devices, and so on. As advances in semiconductor processing and computer architecture push the performance of the computer higher and higher, more sophisticated computer software has evolved to take advantage of the higher performance of the hardware, resulting in computer systems today that are much more powerful than just a few years ago.

Shrinking transistor sizes allow increased logic complexity in modern processors, but smaller dimensions increase power density and require a reduced maximum voltage (VDD_(MAX)) for reliability. This can severely limit the performance achievable in new technologies.

SUMMARY

Embodiments in accordance with the present disclosure provide deterministic frequency and voltage enhancements. A workload optimized frequency control loop deterministically maximizes frequency based on a multidimensional analysis of processor states and conditions. Digital droop sensors use core-throttling or adaptive clock to mitigate microprocessor voltage droops. Robust droop mitigation facilitates a voltage control loop to minimize voltage. This voltage control loop offsets load line uplift, keeping voltage below the reliability VDD_(MAX), while protecting against performance loss from excessive droop mitigation.

An embodiment in accordance with the present disclosure is directed to a method of providing deterministic frequency and voltage enhancements for a processor. The method includes identifying a plurality of parameters related to a processor, the plurality of parameters including at least a current supplied to the processor. The method also includes determining, in dependence upon the plurality of parameters, one or more frequency scaling indexes including determining an effective switching capacitance ratio. The method also includes identifying, in dependence upon the one or more frequency scaling indexes, a predetermined frequency parameter for the processor. The method further includes transitioning, based on the frequency parameter, the processor to a target clock frequency. In some examples, the plurality of parameters further includes one or more of: an ambient temperature, an altitude, one or more input/output (I/O) configuration parameters, one or more core power states, one or more core clock states, an average voltage, and an average frequency. In some examples, the predetermined frequency parameter is identified from a table that maps the one or more frequency scaling indexes to the predetermined frequency parameter.

In some variations of the embodiment, determining, based on the plurality of parameters, one or more frequency scaling indexes includes determining a core activity state index. In some variations, determining, based on the plurality of parameters, one or more frequency scaling indexes includes determining an input/output power index. In some variations, determining, based on the plurality of parameters, one or more frequency scaling indexes includes determining an ambient conditions index. In some examples, transitioning, based on the frequency parameter, the processor to a target clock frequency includes setting, based on the target clock frequency, a target power supply voltage for the processor.

In some variations of this embodiment, the method also includes decreasing, incrementally, the power supply voltage for the processor, determining that a voltage droop parameter exceeds a voltage droop parameter threshold, and increasing, incrementally, the power supply voltage in response to determining that the voltage droop parameter exceeds the voltage droop parameter threshold.

In some variations of this embodiment, the method further includes detecting a voltage droop based on a core voltage falling below a core voltage threshold, throttling one or more regions of the core in response to detecting the voltage droop, and decreasing, incrementally, an amount of throttling based on an increase in core voltage. In some examples, the core voltage threshold is adjusted dynamically in response to transitioning, based on the frequency parameter, the processor to a target clock frequency.

Another embodiment is directed to an apparatus comprising a processor and a memory storing instructions that, when executed by the processor, configure the apparatus to identify a plurality of parameters related to a processor, the plurality of parameters including at least a current supplied to the processor. The instructions further configure the apparatus to determine, in dependence upon the plurality of parameters, one or more frequency scaling indexes including determining an effective switching capacitance ratio. The instructions further configure the apparatus to identify, in dependence upon the one or more frequency scaling indexes, a predetermined frequency parameter for the processor. The instructions further configure the apparatus to transition, based on the frequency parameter, the processor to a target clock frequency. In some examples, the predetermined frequency parameter is identified from a table that maps the one or more frequency scaling indexes to the predetermined frequency parameter.

In some variations of the embodiment, determining, based on the plurality of parameters, one or more frequency scaling indexes includes determining a core activity state index. In some variations, determining, based on the plurality of parameters, one or more frequency scaling indexes includes determining an input/output power index. In some variations, determining, based on the plurality of parameters, one or more frequency scaling indexes includes determining an ambient conditions index. In some examples, transitioning, based on the frequency parameter, the processor to a target clock frequency includes setting, based on the target clock frequency, a target power supply voltage for the processor.

Another embodiment in accordance with the present disclosure is directed to a computer program product comprising a non-transitory computer-readable medium storing computer program instructions that, when executed, cause a computer to identify a plurality of parameters related to a processor, the plurality of parameters including at least a current supplied to the processor. The instructions further cause the computer to determine, in dependence upon the plurality of parameters, one or more frequency scaling indexes including determining an effective switching capacitance ratio. The instructions further cause the computer to identify, in dependence upon the one or more frequency scaling indexes, a predetermined frequency parameter for the processor. The instructions further cause the computer to transition, based on the frequency parameter, the processor to a target clock frequency.

In some variations of the embodiment, determining, based on the plurality of parameters, one or more frequency scaling indexes includes determining a core activity state index. In some variations, determining, based on the plurality of parameters, one or more frequency scaling indexes includes determining an input/output power index. In some variations, determining, based on the plurality of parameters, one or more frequency scaling indexes includes determining an ambient conditions index. In some examples, transitioning, based on the frequency parameter, the processor to a target clock frequency includes setting, based on the target clock frequency, a target power supply voltage for the processor.

Another embodiment in accordance with the present disclosure is directed to another method of providing deterministic frequency and voltage enhancements on a processor. The method includes decreasing, incrementally, a power supply voltage for a processor. The method also includes determining that a voltage droop parameter exceeds a voltage droop parameter threshold. The method further includes increasing, incrementally, the power supply voltage in response to determining that the voltage droop parameter exceeds the voltage droop parameter threshold.

In some examples, the method also includes detecting a voltage droop based on a core voltage falling below a core voltage threshold and, in response to detecting the voltage droop, throttling one or more regions of the core. The method further includes decreasing, incrementally, an amount of throttling based on an increase in core voltage.

In some variations, the core voltage threshold is adjusted dynamically in response to transitioning, based on the frequency parameter, the processor to a target clock frequency. In some variations, the voltage droop parameter is at least one of a number of voltage droop events, a rate of voltage droop events, a number of cycles that a droop mitigation action is active, and a fraction of cycles that the droop mitigation action is active. In some variations, a size of a power supply voltage increment is dynamically selected based on the voltage droop parameter, and a size of a power supply voltage decrement is dynamically selected based on the voltage droop parameter.
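For further explanation, one iteration of such a voltage control loop can be sketched as follows. This is a minimal sketch only, not the disclosed firmware: the function name, the millivolt units, the fixed step size, and the clamping limits are assumptions chosen for illustration, and the droop parameter is assumed here to be a fraction of cycles in which droop mitigation was active.

```python
def undervolt_step(vdd_mv, droop_fraction, droop_threshold,
                   step_mv=5, vmin_mv=700, vmax_mv=1100):
    """Return the next VDD set point in millivolts.

    All parameter names, units, and default values are hypothetical.
    """
    if droop_fraction > droop_threshold:
        # Too much droop mitigation is occurring: raise VDD by one increment,
        # without exceeding the assumed reliability limit.
        return min(vdd_mv + step_mv, vmax_mv)
    # Droop activity is acceptable: keep lowering VDD by one decrement,
    # without going below the assumed floor.
    return max(vdd_mv - step_mv, vmin_mv)


# Hypothetical usage: start at 900 mV with a 5% mitigation threshold.
lowered = undervolt_step(900, droop_fraction=0.01, droop_threshold=0.05)
raised = undervolt_step(900, droop_fraction=0.10, droop_threshold=0.05)
```

A dynamically selected step size, as in the variation described above, could be modeled by computing `step_mv` from the droop parameter before the call; that selection policy is not specified here and is left out of the sketch.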

The foregoing and other objects, features and advantages of the invention will be apparent from the following more particular descriptions of exemplary embodiments of the invention as illustrated in the accompanying drawings wherein like reference numbers generally represent like parts of exemplary embodiments of the invention.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 sets forth a diagram of an example processor for providing deterministic frequency and voltage enhancements for a processor in accordance with some embodiments of the present disclosure.

FIG. 2 sets forth a graphical representation of an example frequency clip table for a core activity state in accordance with some embodiments of the present disclosure.

FIG. 3 sets forth a graphical representation of an example frequency clip table for I/O power in accordance with some embodiments of the present disclosure.

FIG. 4 sets forth a graphical representation of an example frequency clip table for ambient conditions in accordance with some embodiments of the present disclosure.

FIG. 5 sets forth a block diagram of an example droop mitigation system in accordance with some embodiments of the present disclosure.

FIG. 6 sets forth a flow chart illustrating an example method of providing deterministic frequency and voltage enhancements for a processor in accordance with some embodiments of the present disclosure.

FIG. 7 sets forth a flow chart illustrating another example method of providing deterministic frequency and voltage enhancements for a processor in accordance with some embodiments of the present disclosure.

FIG. 8 sets forth a flow chart illustrating another example method of providing deterministic frequency and voltage enhancements for a processor in accordance with some embodiments of the present disclosure.

FIG. 9 sets forth a flow chart illustrating another example method of providing deterministic frequency and voltage enhancements for a processor in accordance with some embodiments of the present disclosure.

FIG. 10 sets forth a flow chart illustrating another example method of providing deterministic frequency and voltage enhancements for a processor in accordance with some embodiments of the present disclosure.

FIG. 11 sets forth a flow chart illustrating another example method of providing deterministic frequency and voltage enhancements for a processor in accordance with some embodiments of the present disclosure.

FIG. 12 sets forth a flow chart illustrating another example method of providing deterministic frequency and voltage enhancements for a processor in accordance with some embodiments of the present disclosure.

FIG. 13 sets forth a flow chart illustrating another example method of providing deterministic frequency and voltage enhancements for a processor in accordance with some embodiments of the present disclosure.

DETAILED DESCRIPTION

Exemplary methods, apparatus, and products for providing deterministic frequency and voltage enhancements for a processor in accordance with the present invention are described with reference to the accompanying drawings, beginning with FIG. 1. FIG. 1 sets forth a diagram of an example processor 100 for providing deterministic frequency and voltage enhancements for a processor in accordance with some embodiments of the present disclosure. The example processor 100 illustrated in FIG. 1 depicts an example layout of an example processor architecture. It will be appreciated that embodiments in accordance with the present disclosure are not limited to the example layout and architecture shown in FIG. 1. In fact, embodiments of the present disclosure may be adapted to any processor without departing from the spirit of the present disclosure.

The example processor 100 includes multiple processor cores 108 and their associated caches, such as layer 2 (L2) and layer 3 (L3) caches. The example processor 100 also includes input/output (I/O) interfaces such as one or more memory interfaces 170, one or more peripheral component interconnect express (PCIe) interfaces 172, one or more system I/O interfaces 174, and one or more symmetric multiprocessing (SMP) interfaces 178. The example processor 100 also includes a system clock 160. In some examples, the system clock 160 is implemented by a digital phase locked loop (DPLL). The system clock 160 provides the clock for the cores 108 and their caches through a clock network (not shown). The example processor 100 also includes one or more nonvolatile memory units 176. In various examples, the nonvolatile memory units 176 may be ROM, EPROM, EEPROM, or Flash memory units. The nonvolatile memory units 176 may store firmware, such as microcode instructions executable by the processor, that controls power and thermal management of the processor. In one example, a nonvolatile memory unit 176 stores firmware 190 embodying a deterministic workload optimized frequency (WOF) controller 152, which is a control loop that provides deterministic frequency enhancements for the system clock 160 based on observable parameters and conditions in and around the processor 100, processor characterizations, and processor models. The WOF controller 152 will be described in greater detail below. In another example, a nonvolatile memory unit 176 stores firmware 192 embodying an undervolt controller 158, which is a control loop providing voltage enhancements to lower VDD based on workload and observed voltage droops. The undervolt controller 158 will be described in greater detail below. In some examples, one or more nonvolatile memory units 176 store reference data 162 such as processor characterization data, processor models, reference tables, characteristic equations, frequency clip tables, and other such information useful to the WOF controller 152 and/or the undervolt controller 158, as will be made apparent below. For example, the reference data 162 may include module vital product data (MVPD) 164, which is described in more detail below.

In some examples, the processor includes an on-chip controller 124 that implements the WOF controller 152 and the undervolt controller 158. The on-chip controller 124 includes an embedded core 126 working with one or more general purpose engine (GPE) microcontrollers 128. In some examples, the on-chip controller 124 executes firmware instructions 190 embodying the WOF controller 152 that are stored in a nonvolatile memory unit 176, which are executable by the core 126 of the on-chip controller 124 to carry out a WOF control loop described in more detail below. For example, the firmware instructions may be loaded into static random-access memory (SRAM) 194 for execution by the core 126. In some examples, the on-chip controller 124 executes firmware instructions 192 embodying the undervolt controller 158 that are stored in a nonvolatile memory unit 176, which are executable by the core 126 of the on-chip controller 124 to carry out an undervolt control loop described in more detail below. For example, the firmware instructions may be loaded into the SRAM 194 for execution by the core 126. In some examples, the on-chip controller 124 also includes a voltage regulation module (VRM) interface 154 through which the on-chip controller 124 receives VDD voltage and current information for the processor 100 from the VRM 140, and through which the on-chip controller 124 sets a VDD voltage parameter (e.g., in a VRM register) that is used by the VRM 140 to control the VDD voltage. In some examples, the on-chip controller 124 also includes an on-chip infrastructure interface 156 that provides access to data from digital thermal sensors, droop sensors, and other on-chip sensors.

In some examples, the module 101 includes the processor 100, the VRM 140, nonvolatile memory units 150, dynamic random access memory (DRAM) 144, an ambient sensor 142, and external package pins (not shown), as well as other components that are omitted from FIG. 1 in the interest of clarity. The VRM 140 is coupled to a power supply and controls the VDD supplied to the processor 100. The nonvolatile memory units 150 may be ROM, EPROM, EEPROM, or Flash memory units. For example, like the nonvolatile memory units 176, the nonvolatile memory units 150 may also store firmware or reference data.

In some examples, as depicted, the example processor 100 is organized into sixteen core units 102, each core unit 102 including a core region and an L3 cache region. Each core region includes a processor core 108 and an L2 cache 110, as well as one or more core power headers 112 and one or more L2 cache power headers 114 that are each independently controlled. The core power headers 112 and L2 cache power headers 114 relay a voltage supply to discrete sections of the core 108 and the L2 cache 110, and further act as switches to turn power on and off for these sections. The L3 cache region includes an L3 cache 106 and one or more L3 cache power headers 116 that are each independently controlled. The L3 cache power headers 116 relay a voltage supply to discrete sections of the L3 cache 106, and further act as switches to turn power on and off for these sections. The core power headers 112, L2 cache power headers 114, and L3 cache power headers 116 receive VDD from the VRM 140.

In some examples, the example processor 100 is organized into tiles 118 for thermal and power management. In the example depicted in FIG. 1, the processor 100 includes eight tiles 118 that each include two core units 102. Each tile 118 also includes a tile management engine 120, which is an embedded microcontroller that controls the core power headers 112, the L2 cache power headers 114, and the L3 cache power headers 116, including transitioning the cores to power save states by, for example, turning the power headers 112, 114, 116 off or stopping region clocks. For example, using an architected STOP instruction, an operating system may request a core 108 to enter a power save state. The core signals the tile management engine 120 to perform this transition, which can include stopping region clocks or powering off regions. The tile management engine 120 also performs the reverse operations to bring the core 108 out of power save states in reaction to system interrupts or other wakeup events. The tile management engines 120 are connected via a sideband communications bus 122 to the on-chip controller 124, to which the tile management engines 120 communicate core power save states and clock power states.

In some examples, as depicted, each core unit 102 includes multiple digital thermal sensors 130 (also referred to herein as a DTS) configured to detect a temperature within a particular area of the core unit 102. For example, the cores 108, L2 cache 110, and L3 cache 106 may each include one or more digital thermal sensors 130. The module 101 also includes an ambient sensor 142 that collects ambient condition information such as, for example, an ambient temperature and an ambient altitude. Readings from the digital thermal sensors 130 and the ambient sensor 142 are relayed to one or more controllers of the processor 100 such as the on-chip controller 124. In some implementations, readings from the digital thermal sensors 130 and the ambient sensor 142 are collected by the on-chip controller 124, for example, through the on-chip infrastructure interface 156.

As discussed above, the system clock 160 provides the clock for the core units 102 through a clock network (not shown). In some examples, all core units 102 use the same clock values from the system clock 160. In some examples, each core uses a dedicated clock frequency source. In other examples, there are a number of core clock sources, with different subsets of cores using different clock sources. As will be explained in more detail below, the system clock 160 is configured to operate at a clock frequency for the processor 100 based on a value set by the on-chip controller 124. For example, the on-chip controller 124 may set a clock register with a value for a target clock frequency. In other examples, there may be multiple registers and multiple frequencies for subsets of cores, or for each individual core. For simplicity, the descriptions below describe a processor with a single clock source used for all cores. This can be generalized so that different cores or subsets of cores can have different clock frequency choices and different voltages or voltage-control loops.

To facilitate deterministic frequency and voltage enhancements in accordance with the present disclosure, the MVPD 164 includes processor characterization data that is generated at manufacture and written into a nonvolatile memory unit on the processor 100 or module 101. In some examples, the MVPD 164 includes AC current, DC current, and leakage current measured while the processor 100 is under a thermal design point (TDP) workload. These measurements are taken at curve fit points within a target operating frequency and VDD range. In some examples, digital droop sensor calibration values (described in more detail below) are also recorded for these curve fit points. In other examples, an off-module memory, such as an external ROM or disk drive, contains parameters that are copied into RAM on the module or chip for frequency and voltage control purposes.

In some examples, as mentioned above, the on-chip controller 124 implements the WOF controller 152 through the execution of firmware including executable microcode that embodies the WOF controller 152. However, it should be appreciated that the WOF controller 152 may also be implemented in digital logic of an integrated circuit, as processor-executable software, or in some other form. The WOF controller 152 carries out a WOF control loop that executes iteratively (e.g., every 500 microseconds). In some implementations, the WOF control loop includes a collect phase, a compute phase, a lookup phase, and an actuate phase.
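For further explanation, the four-phase structure of one loop iteration can be sketched as follows. This is an illustrative skeleton only: the function name and the idea of passing each phase in as a callable are assumptions made for the sketch, not a description of the actual firmware.

```python
def run_wof_iteration(collect, compute, lookup, actuate):
    """Run one hypothetical WOF loop iteration and return the chosen frequency.

    Each argument is a callable standing in for one phase of the loop.
    """
    params = collect()          # collect phase: gather sensor and state readings
    indexes = compute(params)   # compute phase: derive frequency scaling indexes
    frequency = lookup(indexes) # lookup phase: map indexes to a frequency
    actuate(frequency)          # actuate phase: apply the frequency
    return frequency


# Hypothetical usage with stub phases; all values are made up.
applied = []
result = run_wof_iteration(
    collect=lambda: {"idd_amps": 150.0},
    compute=lambda p: {"ceff_ratio": 0.75},
    lookup=lambda i: 4000,
    actuate=applied.append,
)
```

In a real controller the iteration would be scheduled periodically (e.g., every 500 microseconds as noted above); scheduling is omitted from the sketch.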

During the collect phase, the WOF controller 152 identifies parameters gathered by the on-chip controller 124 from the VRM 140, tile management engines 120, digital thermal sensors 130, ambient sensor 142, and so on. For example, the WOF controller 152 identifies a VDD current (IDD) for the processor that is read by the on-chip controller 124 from the voltage regulator module 140. As another example, the WOF controller 152 identifies on-die temperatures provided to the on-chip controller 124 from the digital thermal sensors 130. As yet another example, the WOF controller 152 identifies average clock and power states for all cores and caches (i.e., whether clocks or power are off in those regions) that are provided to the on-chip controller 124 by the tile management engines 120. As yet another example, the WOF controller 152 identifies an I/O bus configuration of the processor that is determined by the on-chip controller 124 through polling of the I/O interfaces. As yet another example, the WOF controller 152 identifies ambient temperature and altitude reported to the on-chip controller 124 from the ambient sensor 142. Further, the WOF controller 152 identifies the average VDD voltage and the frequency recorded over the last WOF cycle (e.g., 500 microseconds).

During the compute phase, the WOF controller 152 generates one or more indexes for use with a predefined frequency clip table. In some examples, a primary index is an effective switching capacitance (C_(eff)) ratio. Generally, power consumed by the processor includes active power (the result of gate switching at the clock frequency) and static power (the result of leakage due to silicon process). The active power can be expressed by equation 1:

P_(active) = C_(eff) V^(k) f = I_(active) V    (1)

Where V is VDD, f is the processor frequency, and k is a technology dependent factor that is commonly ‘2,’ although it varies due to VDD sensitivity to capacitance. Thus, taking k as 2, effective switching capacitance for the active workload can be expressed by equation 2:

C_(eff-active) = I_(active) / (V f)    (2)

Where I_(active) is IDD less IDD quiescent. Thus, the C_(eff-active) may be computed using the IDD parameter from the VRM 140 and leakage data from the MVPD 164, as well as the average VDD and frequency recorded over the last WOF cycle. This provides a workload metric that can be compared to the TDP. An effective switching capacitance for the TDP workload may be calculated using the AC current value for the same voltage and frequency from the TDP characterization curve represented in the MVPD 164, thereby normalizing out process, voltage and frequency. Thus, the effective switching capacitance ratio is expressed by equation 3:

C_(eff-ratio) = C_(eff-active) / C_(eff-tdp)    (3)

When C_(eff-active) is less than C_(eff-tdp), and thus the C_(eff-ratio) is less than 1, the resulting power credit allows core frequency to be increased.
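For further explanation, the calculation of equations 2 and 3 can be sketched as follows. This is a minimal sketch assuming k is 2; the function names, parameter names, and the sample current, voltage, and frequency values are hypothetical and are not taken from any characterization data.

```python
def effective_capacitance(i_active_amps, vdd_volts, freq_hz):
    """Equation 2 (with k = 2): C_eff = I_active / (V * f)."""
    if vdd_volts <= 0 or freq_hz <= 0:
        raise ValueError("voltage and frequency must be positive")
    return i_active_amps / (vdd_volts * freq_hz)


def ceff_ratio(idd_amps, idd_quiescent_amps, i_ac_tdp_amps, vdd_volts, freq_hz):
    """Equation 3: C_eff-active divided by C_eff-tdp at the same V and f."""
    # I_active is the measured IDD less the quiescent (leakage) IDD.
    i_active = idd_amps - idd_quiescent_amps
    c_active = effective_capacitance(i_active, vdd_volts, freq_hz)
    # The TDP capacitance uses the characterized AC current for the TDP workload.
    c_tdp = effective_capacitance(i_ac_tdp_amps, vdd_volts, freq_hz)
    return c_active / c_tdp


# Hypothetical readings: 150 A measured, 30 A quiescent, 160 A TDP AC current.
ratio = ceff_ratio(150.0, 30.0, 160.0, vdd_volts=1.0, freq_hz=4.0e9)
# ratio is about 0.75, i.e. a lighter workload than the TDP model.
```

Because both capacitances are evaluated at the same voltage and frequency, those terms cancel and the ratio reduces to a ratio of currents, which is the sense in which voltage and frequency are normalized out.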

In some examples, a second index computed by the WOF controller 152 is a core activity state. The core activity state is a ratio of the average time the cores are active, clocked off, or powered off, relative to fully active, resulting in a power credit. The core activity state can be computed based on core STOP states that are reported by the tile management engines 120. For example, if the WOF controller 152 identifies that, on average, the core regions are only 80% active, a power consumption modeled for fully active cores in the TDP definition can be reduced by 20%, which can allow for an additional frequency boost.
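For further explanation, the 80% example can be sketched as follows. This is an illustrative sketch in which each core's activity is assumed to have already been reduced to a single fraction of the measurement interval; the function names and the per-core values are hypothetical, and the distinction between clocked-off and powered-off states is not modeled.

```python
def core_activity_index(active_fractions):
    """Average fraction of the interval that the cores were fully active."""
    if not active_fractions:
        raise ValueError("at least one core is required")
    return sum(active_fractions) / len(active_fractions)


def core_power_credit(active_fractions):
    """Fraction by which the modeled fully-active core power can be reduced."""
    return 1.0 - core_activity_index(active_fractions)


# Hypothetical interval: twelve of sixteen cores fully active, four 20% active.
fractions = [1.0] * 12 + [0.2] * 4
index = core_activity_index(fractions)   # about 0.8
credit = core_power_credit(fractions)    # about 0.2
```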

In some examples, a third index computed by the WOF controller 152 is I/O power. In these examples, the WOF controller 152 uses the runtime-sampled bus configuration, which can be identified by the on-chip controller 124 polling the I/O interfaces, and an I/O power proxy table 166 to identify the associated power parameter for each link type (e.g., memory, PCIe, SMP) of the current I/O configuration. These power parameters may be accumulated into a single index. When the current I/O configuration uses less power than the I/O configuration modeled for a TDP definition, the resulting power credit can allow for an additional frequency boost. The I/O power proxy table 166 may be stored in the reference data 162, incorporated into the MVPD 164, or stored as separate reference data in a different nonvolatile memory unit.
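For further explanation, accumulating per-link power parameters into a single index can be sketched as follows. This is a sketch under stated assumptions: the proxy table contents, the watt units, the link counts, and the reference power value are all made up for illustration and do not reflect the actual I/O power proxy table 166.

```python
# Hypothetical proxy table: assumed power in watts per active link of each type.
IO_POWER_PROXY_WATTS = {"memory": 3.0, "pcie": 2.0, "smp": 4.0}


def io_power_index(link_counts, proxy=IO_POWER_PROXY_WATTS):
    """Accumulate the proxy power for every link in the sampled configuration."""
    total = 0.0
    for link_type, count in link_counts.items():
        total += proxy[link_type] * count  # KeyError for an unknown link type
    return total


# Hypothetical sampled configuration and TDP reference I/O power.
sampled = {"memory": 8, "pcie": 4, "smp": 2}
index = io_power_index(sampled)          # 24 + 8 + 8 watts
reference_watts = 50.0
credit = reference_watts - index         # positive when I/O uses less power
```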

In some examples, a fourth index computed by the WOF controller 152 is an ambient condition index. The ambient condition index adjusts for the ambient room temperature and altitude to give a thermal cooling credit to the expected TDP definition. For example, the WOF controller 152 may provide N watts of power credit per degree Celsius that the ambient temperature is below a reference value used for the TDP definition. The WOF temperature component is dependent on sensing at the system air-intake, not within the processor, which provides deterministic benefit (i.e., not sensitive to manufacturing variations). The altitude component stems from higher density air improving the effectiveness of heatsink cooling at altitudes less than 1000 meters. The ambient condition index can be computed based on ambient temperature and altitude parameters reported by the ambient sensor 142 to the on-chip controller 124.
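For further explanation, the temperature component of such a credit can be sketched as follows. This is a minimal sketch: the function name, the value chosen for N, and the reference temperature are hypothetical, and the altitude component is deliberately omitted because no formula for it is given here.

```python
def ambient_power_credit(ambient_c, reference_c, watts_per_degree):
    """Power credit in watts for ambient temperature below the TDP reference.

    A negative result means the ambient is warmer than the reference.
    """
    return (reference_c - ambient_c) * watts_per_degree


# Hypothetical values: 22 C intake air, 27 C reference, N = 1.5 W per degree.
credit = ambient_power_credit(22.0, 27.0, watts_per_degree=1.5)
debit = ambient_power_credit(30.0, 27.0, watts_per_degree=1.5)
```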

During the lookup phase, the WOF controller 152 utilizes the computed indexes to identify an optimized operating frequency. In some examples, the WOF controller 152 identifies a frequency from one or more WOF frequency clip tables stored in the MVPD 164. To generate the WOF clip tables, each point in the WOF operating space across all four dimensions described above is simulated and stored in the MVPD. This ensures deterministic frequency behavior for all modules of a given product. In each frequency clip table, frequency is plotted as a function of the primary index C_(eff-ratio) along a primary curve, with the TDP frequency occurring at a C_(eff-ratio) of ‘1.’ Lighter workloads than the TDP model lead to decreased C_(eff-ratio) and higher frequency, whereas heavier workloads than the TDP model lead to increased C_(eff-ratio) and lower frequency. For each additional dimension, a frequency clip table for that dimension includes secondary curves plotted based on that dimensional index. The calculated C_(eff-ratio) causes the operating frequency of the module to move along the x-axis while the other indices apply secondary frequency adjustments to the primary curve along the y-axis. The result is a WOF frequency that the processor can boost to within the system power delivery, thermal cooling, and technology voltage limits.
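For further explanation, a lookup against a two-dimensional clip table can be sketched as follows. This is an illustrative sketch only: the grid points, the table contents in MHz, the use of a single secondary index, and the choice to round the ratio up to the next grid point are all assumptions made for the example rather than details of the actual clip tables.

```python
from bisect import bisect_left

# Hypothetical C_eff-ratio grid points (x-axis of the clip table).
CEFF_GRID = [0.6, 0.8, 1.0, 1.2]

# Hypothetical clip table in MHz. Row 0 is the primary (TDP reference) curve;
# higher rows are secondary curves carrying more power credit.
CLIP_TABLE_MHZ = [
    [4200, 4000, 3800, 3500],
    [4300, 4100, 3900, 3600],
    [4400, 4200, 4000, 3700],
]


def lookup_frequency(ceff_ratio, secondary_step):
    """Return the clip frequency for a ratio and a secondary curve number."""
    # Round the ratio up to the next grid point so the result is never
    # more optimistic than the stored point; clamp to the last column.
    col = min(bisect_left(CEFF_GRID, ceff_ratio), len(CEFF_GRID) - 1)
    # Clamp the secondary curve selection to the rows that exist.
    row = max(0, min(secondary_step, len(CLIP_TABLE_MHZ) - 1))
    return CLIP_TABLE_MHZ[row][col]


tdp_mhz = lookup_frequency(1.0, 0)      # TDP point on the primary curve
boosted_mhz = lookup_frequency(0.75, 2) # lighter workload, more credit
```

Because every result is read from precomputed entries rather than derived from per-part measurements at run time, two modules given the same indexes return the same frequency, which is the deterministic behavior described above.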

For further explanation, FIG. 2 sets forth a graphical representation of an example frequency clip table 200 for a core activity state in accordance with some embodiments of the present disclosure. The frequency clip table 200 includes a primary curve of core frequency vs. the primary index C_(eff-ratio) for the TDP model in which all cores are fully active. Secondary curves are provided for additional models in which a varying number of cores are disabled. Where cores are not fully active during the measurement interval, the core activity state may be used to index a secondary curve that is shifted up along the y-axis, thus yielding a frequency boost.

For further explanation, FIG. 3 sets forth a graphical representation of an example frequency clip table 300 for I/O power in accordance with some embodiments of the present disclosure. The frequency clip table 300 includes a primary curve of core frequency vs. the primary index C_(eff-ratio) for a TDP model using a reference I/O configuration. Secondary curves are provided for additional models in which more or less I/O power is required based on the I/O configuration. Where less I/O power than the reference model is utilized, the I/O power may be used to index a particular secondary curve that is shifted up along the y-axis, thus yielding a frequency boost. Where more I/O power than the reference model is utilized, the I/O power may be used to index a particular secondary curve that is shifted down along the y-axis, thus requiring a lower frequency.

For further explanation, FIG. 4 sets forth a graphical representation of an example frequency clip table 400 for ambient conditions in accordance with some embodiments of the present disclosure. The frequency clip table 400 includes a primary curve of core frequency vs. the primary index C_(eff-ratio) for a TDP model using a reference ambient temperature. Secondary curves are provided for additional models in which the ambient temperature is higher or lower than that used for a TDP reference model. Where the ambient temperature is lower than the model, the ambient conditions index may be used to index a particular secondary curve that is shifted up along the y-axis, thus yielding a frequency boost. Where the ambient temperature is higher than the model, the ambient conditions index may be used to index a particular secondary curve that is shifted down along the y-axis, thus requiring a lower frequency.

Returning to FIG. 1, during the actuation phase, the WOF controller 152 transitions the processor 100 to a new target frequency based in part on the frequency from the lookup phase. For example, the WOF controller 152 may utilize the minimum of (a) the frequency from the lookup phase, (b) thermal limits, and (c) software directed requests to adjust the frequency of the system clock 160. In some examples, the WOF controller 152 uses the frequency parameter identified from the lookup phase to set a frequency for the system clock 160 by writing a value of the frequency to a register of the system clock 160 or otherwise indicating the frequency to the system clock 160. In some examples, the WOF controller 152 also uses the frequency parameter identified from the lookup phase to identify a corresponding target VDD from the MVPD 164. In these examples, the WOF controller 152 signals the VRM 140 to adjust the set VDD to the target VDD.
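As a minimal sketch of the minimum selection mentioned above, the following function picks the lowest of the candidate frequencies. The function and parameter names are hypothetical, and treating the software request as optional is an assumption.

```python
def select_target_frequency(lookup_mhz, thermal_limit_mhz, software_request_mhz=None):
    """Return the lowest of the lookup, thermal, and (optional) software frequencies."""
    candidates = [lookup_mhz, thermal_limit_mhz]
    if software_request_mhz is not None:
        # A software-directed request can only lower the result, never raise it.
        candidates.append(software_request_mhz)
    return min(candidates)
```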

As described above, the WOF controller 152 provides deterministic dynamic frequency and voltage scaling in response to changing workloads on the processor. These changing workloads may also cause power supply droops that can potentially cause critical failures in the processor cores, particularly where an intense load is placed on the processor after a relatively idle state. Droop mitigation techniques may be employed to safeguard against such critical failures. While reducing clock frequency alone can mitigate droops with less incremental performance effect per droop, this may be incompatible with the response times associated with a global clock architecture where all active cores and caches share a single clock to improve cache latency and coherency. In accordance with some embodiments of the present disclosure, power supply droop mitigation is accomplished by sensing droops with digital droop sensors in each core.

FIG. 5 sets forth a block diagram of an example droop mitigation system in accordance with some embodiments of the present disclosure. The droop mitigation system is shown in the context of an example core 508 (e.g., a core 108 described above) that includes one or more digital droop sensors 532 (also referred to herein as a ‘DDS’) configured to detect a voltage droop in the processor core 508. Each digital droop sensor 532 is coupled to a throttle sequencing engine 536, which is coupled to one or more core throttle controllers 538. Thus, each digital droop sensor 532 is configured to issue commands to the throttle sequencing engine 536, which in turn issues core throttling commands to the core throttle controllers 538 to throttle instruction processing in the core 508.

A transition from system idle to a high intensity workload on the processor cores causes significant gate switching and creates a large current draw. The local capacitance is not sufficient to maintain the initial voltage, and the core voltage (V_(core)) droops suddenly until adequate current can be provided to compensate for the increased load. After the voltage droop, the final V_(core) reaches a steady state value dictated by the system loadline. Utilizing droop sensors, the core can engage droop mitigation when V_(core) drops below a threshold. Core throttling is chosen to restrict throughput, reducing latch and data switching rates. The core throttle settings are programmable and characterized to determine values that will quickly stop the droop as well as subsequently recover full instruction execution in a controlled manner. Simply turning off the throttle would generate another droop as the system transitions from idle back to the heavy workload. Taking advantage of this droop mitigation, VDD can be reduced to maintain the same timing margin at the bottom of the mitigated droop. After a controlled return from throttling, the final workload induced voltage has a similar loadline cost as the non-throttled case. However, both VDD and the die circuit voltage (V_(core)) are reduced through the savings provided by droop mitigation.

In some examples, each digital droop sensor 532 contains a programmable delay feeding a latch-tapped-delay-line with 24 output latches. The 24 output latches produce a thermometer code value proportional to the timing margin for the previous 2 clock cycles. When voltage droops are detected by a digital droop sensor 532, core instruction rates are briefly throttled in that core to reduce current, thus stopping droops to protect a timing margin.
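A thermometer code of this kind can be decoded by counting the run of set latches. The sketch below is a software illustration only: the list representation of the latches, the names thermometer_value and should_throttle, and the "at or below the trigger" comparison are assumptions rather than details of the sensor hardware.

```python
NUM_LATCHES = 24  # number of output latches named in the text


def thermometer_value(latches):
    """Count the contiguous run of set latches from the start of the delay line.

    A larger value stands for more timing margin; a droop shortens the run.
    """
    if len(latches) != NUM_LATCHES:
        raise ValueError("expected 24 latch outputs")
    value = 0
    for bit in latches:
        if not bit:
            break
        value += 1
    return value


def should_throttle(latches, trigger):
    """Assumed policy: throttle when the decoded value is at or below the trigger."""
    return thermometer_value(latches) <= trigger
```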

In some examples, each digital droop sensor 532 is calibrated during manufacturing test to ensure appropriate timing margin protection of processor critical timing paths. In an example calibration process, the VDD_(MIN) of all good cores is determined at each of up to 8 frequencies. Then, starting from each frequency point, the digital droop sensor 532 calibration state is entered: frequency is reduced, voltage is increased to provide guard band over the short-duration VDD_(MIN), and the same workload is executed. The programmable delay of every digital droop sensor 532 is adjusted to output the same minimum thermometer-code value M. M is chosen to allow observation of typical operation as well as worst case droops within the 24-bit thermometer code. For example, when the two digital droop sensors 532 in each core always read a value of M or higher, that core has the desired timing margin. The next step of calibration pulses a high-current workload on all cores to create maximal droops, during which the DDS trigger that initiates throttling is tuned to prevent the DDS reading in any core from dropping below M. Since the droop mitigation may take, for example, approximately 5 nanoseconds to detect and stop a droop, throttling begins at a DDS value of M+N, where the trigger threshold N is chosen to guarantee that no digital droop sensor 532 ever reads a value below M regardless of the starting VDD. N may be uniquely chosen for each frequency point tested. Once each digital droop sensor 532 is calibrated and the trigger threshold determined, DDS droop mitigation protects timing margin over a wide range of VID values and workloads at each desired target frequency.

Depending on the global clocking architecture and design, as well as the response time to adapt the clock frequency compared to the response time to actuate core throttling, either droop-throttling or adaptive clock methods, or both, may be optimal to provide robust droop mitigation over the required range of voltages at each desired core clock target frequency. In other embodiments, droop-mitigation uses current-injection from nearby capacitance charged to a higher voltage or other switched-capacitor schemes.

Regardless of the specific droop-mitigation method, when robust droop-mitigation is enabled, the cores are protected from functional errors over a wider range of regulator VDD setpoints. This robust timing protection enables a dynamic voltage adjustment, since if the VDD setpoint is temporarily too low, the droop mitigation will prevent errors. This may result in temporary performance loss, but this can be corrected quickly. In the last step, voltage is adjusted while a product-reference workload is run to find the lowest VDD value that reduces DDS detection and throttling to rates that have no significant effect on processor performance. This final voltage is then used for power measurements at each frequency and may be written into MVPD. WOF control loop computations rely on this voltage as a primary reference for runtime power and frequency optimization. When managing dynamic voltage and frequency slewing (DVFS), the WOF control loop interpolates the calibration delays, DDS trigger values, and relative voltages between the MVPD content associated with each frequency point. Since the DDS delay is sensitive to temperature, cross-chip voltage gradients, and end of life degradation much like critical circuit delays, the tracking between DDS and the critical paths enables reduction in the associated guard bands. In some examples, the VDD written in the MVPD is used only as a starting point during the boot process, after which the VDD is dynamically optimized. In some examples, the WOF controller 152 signals an updated DDS throttling trigger threshold N to one or more digital droop sensors 532 in response to transitioning the processor 100 to a new frequency, where the updated DDS trigger value for that frequency is derived from the MVPD 164.

For most workloads where the C_(eff) changes relatively rarely by relatively small amounts, a very simple control loop is sufficient. For some problematic highly variable workloads, there may be increased average performance loss for an extended period of time. In that case, the simple control loop can be supplemented using machine learning methods, such as reinforcement learning, to reduce performance loss while remaining within Vmax or power limits or other constraints.

Returning to FIG. 1, the firmware unit 150 also stores microcode instructions 192 for an undervolt controller 158 that are executable by the core 126 of the on-chip controller 124 to carry out an undervolt control loop. In some examples, the undervolt control loop utilizes the droop sensors (e.g., the digital droop sensors 532 of FIG. 5) to lower voltages even further by dynamically adjusting the VDD to the minimum value that prevents performance loss. Initially, the VDD is reduced during system idle based on droop sensor feedback. When a high intensity workload starts, the voltage quickly drops below the throttle threshold and core throttling engages to maintain the voltage margin. As throttling eases off, the VDD is still too low to provide enough current to maintain a V_(core) above the throttle threshold, which re-engages core throttling. However, the undervolt control loop detects this iterative core throttling and raises the VDD incrementally until droop events have no impact on performance. The voltage oscillates around the floor voltage while VDD is increased and throttling is active. By continuously optimizing the required voltage to provide just enough to avoid circuit failure, the undervolt control loop reduces V_(core) voltages significantly, enabling higher clock frequencies while staying below the VDD_(MAX) limit, as well as current and power limits.

In some examples, the undervolt control loop leverages droop indications from digital droop sensors (e.g., the digital droop sensors 532 of FIG. 5) to continuously adjust VDD voltages. Although a digital droop sensor 532 injects throttles to the core in the short term, if droops repeat, the undervolt control loop raises VDD to prevent any performance effect. While the undervolt control loop is referred to as ‘undervolt,’ the undervolt control loop may temporarily increase VDD above the reference value in the MVPD if required. In some examples, VDD is controlled by counting the number of cycles a measured digital droop sensor 532 value is at or below a chosen digital droop sensor value (separate from the value that initiates throttling by the digital droop sensor 532). If a digital droop sensor 532 reads at or below this value for a number of measuring instances above a predetermined threshold, the VDD voltage is increased. Otherwise, the voltage is decreased at a particular interval (e.g., every 125 microseconds). When the WOF control loop initiates a frequency transition, throttling carried out by the digital droop sensor 532 maintains a timing margin, while the undervolt control loop optimizes VDD to the lowest value that avoids any performance effect.

In some examples, VDD is controlled by counting the number of cycles the DDS value is at or below a chosen DDS undervolting bin B_(UNDERVOLT). This undervolting bin can be chosen to be N_(diff) bins above the bin B_(THROTTLE) that elicits core throttling and droop mitigation by the digital droop sensors 532. If the digital droop sensor reads at or below this value too often, VDD is increased. Otherwise, the VDD is decreased incrementally (e.g., every 125 microseconds).
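One evaluation interval of such a loop might be sketched as follows. The names undervolt_bin and undervolt_step, the count limit, and the 5 mV step size are hypothetical; the source gives only the 125 microsecond interval as an example, not a step size or a count limit.

```python
def undervolt_bin(b_throttle, n_diff):
    """B_UNDERVOLT sits N_diff bins above the throttling bin B_THROTTLE."""
    return b_throttle + n_diff


def undervolt_step(dds_samples, b_undervolt, max_low_count, vdd_mv, step_mv=5):
    """Return the VDD setpoint (mV) for the next interval.

    dds_samples are the per-cycle DDS values seen during the interval
    (e.g., 125 microseconds). If too many samples sit at or below the
    undervolting bin, VDD is raised by one step; otherwise it is lowered.
    """
    low_count = sum(1 for sample in dds_samples if sample <= b_undervolt)
    if low_count > max_low_count:
        return vdd_mv + step_mv
    return vdd_mv - step_mv
```

Calling undervolt_step once per interval yields the slow downward drift during quiet periods and the incremental recovery after repeated droops that the preceding paragraphs describe.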

It will be appreciated that the combined effect of the WOF control loop and the undervolt control loop to manage frequency and voltage allows for an increase in frequency and a reduction in VDD with negligible performance loss. In particular, utilization of the ambient index and I/O configuration index by the WOF control loop, as described above, allows for an increase in frequency for a workload, while droop mitigation by the digital droop sensors 532 reduces the voltage guardband, also allowing for an increase in frequency. The undervolt control loop dynamically reduces VDD below VDD_(MAX), which also saves power and allows for an increase in the maximum frequency.

FIG. 6 sets forth a flow chart illustrating an example method of providing deterministic frequency and voltage enhancements for a processor in accordance with some embodiments of the present disclosure. The example method of FIG. 6 includes identifying 602 a plurality of parameters 605 related to a processor 603, the plurality of parameters 605 including at least a current supplied to the processor. In some examples, a microcontroller 601 of a processor 603 identifies parameters 605 related to the processor 603 including the current supplied to the processor 603 as well as other parameters 605 such as on-die temperatures, I/O configuration, processor core stop states including whether a particular core is clocked off or powered off, ambient conditions such as an off-die temperature and an altitude, an average voltage, and an average frequency. For example, the microcontroller 601 may identify a VDD current (IDD) for the processor 603 that is read by the microcontroller 601 from a voltage regulator module. As another example, the microcontroller 601 may identify on-die temperatures provided to the microcontroller 601 from the digital thermal sensors in the processor 603. As yet another example, the microcontroller 601 may identify average clock and power states for local regions (i.e., whether clocks or power are off in those regions) from locally embedded microcontrollers. As yet another example, the microcontroller 601 may identify an I/O bus configuration through polling of the I/O interfaces. As yet another example, the microcontroller 601 may identify an ambient temperature and/or an altitude from an ambient sensor. Further, the microcontroller 601 may record an average VDD voltage, sampled from the voltage regulator module, and the frequency, sampled from the system clock, over the preceding collection interval (e.g., 500 microseconds).

The method of FIG. 6 also includes determining 604, in dependence upon the plurality of parameters 605, one or more frequency scaling indexes 607 including determining an effective switching capacitance ratio. In some examples, the microcontroller 601 determines one or more frequency scaling indexes 607 using one or more of the parameters 605 including determining the effective switching capacitance ratio. For example, the current supplied to the processor 603, the average voltage supplied to the processor 603, and the average frequency of the processor 603, from among the plurality of parameters 605, can be used to determine the active effective switching capacitance, as described above with respect to equation 2. A thermal design point effective switching capacitance for a modeled workload can be determined using an AC current, DC current, and leakage value obtained from reference data, such as MVPD, for the same voltage and frequency and using the same equation. The effective switching capacitance ratio can be determined as the ratio of the active effective switching capacitance and the thermal design point effective switching capacitance (e.g., using equation 3 above). In some examples, the effective switching capacitance ratio is used as an index to a frequency scaling table. In other examples, the effective switching capacitance ratio may be used as an input to a frequency scaling equation that models frequency as a function of the effective switching capacitance ratio.
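The ratio computation can be sketched as follows. Equations 2 and 3 are not reproduced in this passage, so the sketch assumes the common first-order model in which switching (AC) current is approximately C_eff × V × f and the AC current is the measured current minus leakage; the function and parameter names are hypothetical, and the actual equations may include further terms.

```python
def effective_capacitance(ac_current_a, voltage_v, frequency_hz):
    """Assumed first-order model: I_ac = C_eff * V * f, so C_eff = I_ac / (V * f)."""
    return ac_current_a / (voltage_v * frequency_hz)


def ceff_ratio(measured_idd_a, leakage_a, tdp_ac_current_a, voltage_v, frequency_hz):
    """Ratio of the active C_eff to the thermal design point C_eff.

    Both values are evaluated at the same voltage and frequency, as the text
    requires. Leakage is subtracted from the measured current first (assumed).
    """
    active = effective_capacitance(measured_idd_a - leakage_a, voltage_v, frequency_hz)
    tdp = effective_capacitance(tdp_ac_current_a, voltage_v, frequency_hz)
    return active / tdp
```

Under this assumed model the voltage and frequency terms cancel in the ratio, so a workload drawing 90 A of switching current against a 100 A reference evaluates to a ratio of about 0.9.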

The method of FIG. 6 also includes identifying 606, in dependence upon the one or more frequency scaling indexes 607, a predetermined frequency parameter 609 for the processor 603. In some examples, the microcontroller 601 identifies the predetermined target frequency by using the effective switching capacitance ratio to index into a frequency scaling table. In one example, the frequency scaling table identifies an optimal frequency value that has been predetermined for a particular effective switching capacitance ratio. For example, the optimal frequency value may indicate the highest frequency (expressed absolutely or as a ratio of the current frequency) that does not violate power and thermal constraints of the processor for a workload whose metric is expressed by the effective switching capacitance ratio. In one example, a priori simulation of the processor is employed to identify the optimal frequency value for each effective switching capacitance ratio value. The result is a deterministic relationship between the activity of a workload relative to a thermal design point (indicated by the effective switching capacitance ratio) and an optimal clock frequency for that workload. For example, the frequency scaling table may indicate that a workload operating at 90% of the thermal design point (i.e., an effective switching capacitance ratio of 0.9) allows for a 3% boost in frequency. While the effective switching capacitance ratio is the primary index for the frequency scaling table, additional secondary indexes may also be used to identify the optimal frequency value for the workload, as will be described in more detail below. For example, an optimal frequency value may be provided for each combination of two or more indexes, as determined through processor simulation. Although frequency scaling tables are described, it will be appreciated that a frequency scaling equation (e.g., derived from simulations or models of the processor) may also be used to identify an optimal frequency value based on the effective switching capacitance ratio.

The method of FIG. 6 also includes transitioning 608, based on the frequency parameter 609, the processor 603 to a target clock frequency. In some examples, the microcontroller 601 transitions the processor to a target clock frequency by indicating a value of the target clock frequency to the system clock. For example, the microcontroller 601 may indicate the value by writing the value for the target clock frequency into a register used by the system clock. In some examples, the microcontroller 601 transitions the processor to the target clock frequency in frequency steps over a particular number of cycles or time increment.
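A stepped transition of this kind might be sketched as follows, where each returned value would be written to the clock register in turn. The name frequency_steps and the fixed step size are hypothetical; the source does not specify step sizes or timing.

```python
def frequency_steps(current_mhz, target_mhz, step_mhz):
    """List the intermediate frequencies visited on the way to the target.

    The final entry is always exactly the target; the last step is shortened
    if the remaining distance is smaller than step_mhz.
    """
    if step_mhz <= 0:
        raise ValueError("step_mhz must be positive")
    steps = []
    freq = current_mhz
    while freq != target_mhz:
        if freq < target_mhz:
            freq = min(freq + step_mhz, target_mhz)
        else:
            freq = max(freq - step_mhz, target_mhz)
        steps.append(freq)
    return steps
```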

For further explanation, FIG. 7 sets forth a flow chart illustrating another example method of providing deterministic frequency and voltage enhancements for a processor in accordance with some embodiments of the present disclosure. Like the example method of FIG. 6, the example method of FIG. 7 also includes identifying 602 a plurality of parameters 605 related to a processor 603, the plurality of parameters 605 including at least a current supplied to the processor; determining 604, in dependence upon the plurality of parameters 605, one or more frequency scaling indexes 607 including determining an effective switching capacitance ratio; identifying 606, in dependence upon the one or more frequency scaling indexes 607, a predetermined frequency parameter 609 for the processor 603; and transitioning 608, based on the frequency parameter 609, the processor 603 to a target clock frequency.

In the example method of FIG. 7, determining 604, in dependence upon the plurality of parameters 605, one or more frequency scaling indexes 607 further includes determining 702 a core activity state index. As discussed above, the core activity state index is based on core power and clock states collected from embedded power controllers, and can be utilized to further adjust the optimal frequency value that is determined based on the effective switching capacitance by, for example, correlating a core activity ratio to a predetermined power credit. In some examples, the microcontroller 601 determines 702 a core activity state index based on a ratio of the average time the cores are active, clocked off, or powered off, relative to fully active over a predetermined time interval. For example, the microcontroller 601 may compute the core activity state based on reported core STOP states. This core activity ratio is indicative of power consumption across the processor and whether cores may be disabled without affecting performance. For example, given a processor having ten core regions that are averaged to only 80% active, it may be the case that two core regions may be powered off without affecting performance. Thus, a power consumption model for fully active cores in the TDP definition can be reduced by 20%, which can allow for an additional frequency boost. In some examples, the core activity state index is a secondary index that is combined with the effective switching capacitance ratio (the primary index) to identify an optimal frequency value from a frequency scaling table. The frequency scaling table may be created by simulating all combinations of effective switching capacitance ratio and core activity state. These two indexes may also be combined with additional indexes and combinations of indexes to index into the frequency scaling table.
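The ten-core example above can be worked through in a short sketch. The function names are hypothetical, and the linear relationship between inactivity and power credit is an assumption made for illustration; the source states only that the ratio is correlated to a predetermined power credit.

```python
def core_activity_ratio(active_fractions):
    """Average fraction of the interval that each core region was fully active."""
    return sum(active_fractions) / len(active_fractions)


def core_power_credit(active_fractions, tdp_core_power_w):
    """Assumed linear credit: the inactive share of the TDP core power budget."""
    return (1.0 - core_activity_ratio(active_fractions)) * tdp_core_power_w


# Ten core regions averaging 80% active: eight fully active, two powered off.
regions = [1.0] * 8 + [0.0] * 2
ratio = core_activity_ratio(regions)          # about 0.8
credit_w = core_power_credit(regions, 100.0)  # about 20 W of a 100 W budget
```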

For further explanation, FIG. 8 sets forth a flow chart illustrating another example method of providing deterministic frequency and voltage enhancements for a processor in accordance with some embodiments of the present disclosure. Like the example method of FIG. 6, the example method of FIG. 8 also includes identifying 602 a plurality of parameters 605 related to a processor 603, the plurality of parameters 605 including at least a current supplied to the processor; determining 604, in dependence upon the plurality of parameters 605, one or more frequency scaling indexes 607 including determining an effective switching capacitance ratio; identifying 606, in dependence upon the one or more frequency scaling indexes 607, a predetermined frequency parameter 609 for the processor 603; and transitioning 608, based on the frequency parameter 609, the processor 603 to a target clock frequency.

In the example method of FIG. 8, determining 604, in dependence upon the plurality of parameters 605, one or more frequency scaling indexes 607 further includes determining 802 an input/output power index. As discussed above, the I/O power index is based on a runtime-sampled I/O configuration collected by polling I/O interfaces to determine link states, and can be utilized to further adjust the optimal frequency value that is determined based on the effective switching capacitance by, for example, correlating runtime I/O power to a predetermined power credit. In some examples, the microcontroller 601 determines 802 an I/O power index based on a runtime-sampled bus configuration, which can be identified by the microcontroller 601 polling the I/O interfaces. In some examples, the microcontroller 601 further determines 802 the I/O power index based on an I/O power proxy table 166 to identify the associated power parameter for each link type (e.g., memory, PCIe, SMP) of the current I/O configuration. The I/O power proxy table may be stored in the reference data or incorporated into the MVPD. These power parameters may be accumulated into a single index. When the current I/O configuration uses less power than the I/O configuration modeled for a TDP definition, the resulting power credit can allow for an additional frequency boost. In some examples, the I/O power index is a secondary index that is combined with the effective switching capacitance ratio (the primary index) to identify an optimal frequency value from a frequency scaling table. The frequency scaling table may be created by simulating all combinations of effective switching capacitance ratio and I/O power index. These two indexes may also be combined with additional indexes and combinations of indexes to index into the frequency scaling table.
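The accumulation of per-link power parameters into a single index might look like the following sketch. The per-link wattages in IO_POWER_PROXY_W and the function names are hypothetical placeholders, not values from the I/O power proxy table 166.

```python
# Hypothetical proxy table: power parameter (watts) per active link of each type.
IO_POWER_PROXY_W = {"memory": 3.0, "pcie": 2.0, "smp": 4.0}


def io_power_index(active_links):
    """Accumulate the proxy power of every active link into one index."""
    return sum(IO_POWER_PROXY_W[link] for link in active_links)


def io_power_credit(active_links, reference_links):
    """Power credit relative to the reference (TDP) I/O configuration.

    Positive when the runtime configuration uses less I/O power than the
    reference, negative when it uses more.
    """
    return io_power_index(reference_links) - io_power_index(active_links)
```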

For further explanation, FIG. 9 sets forth a flow chart illustrating another example method of providing deterministic frequency and voltage enhancements for a processor in accordance with some embodiments of the present disclosure. Like the example method of FIG. 6, the example method of FIG. 9 also includes identifying 602 a plurality of parameters 605 related to a processor 603, the plurality of parameters 605 including at least a current supplied to the processor; determining 604, in dependence upon the plurality of parameters 605, one or more frequency scaling indexes 607 including determining an effective switching capacitance ratio; identifying 606, in dependence upon the one or more frequency scaling indexes 607, a predetermined frequency parameter 609 for the processor 603; and transitioning 608, based on the frequency parameter 609, the processor 603 to a target clock frequency.

In the example method of FIG. 9, determining 604, in dependence upon the plurality of parameters 605, one or more frequency scaling indexes 607 further includes determining 902 an ambient conditions index. As discussed above, the ambient conditions index is based on data collected from sensors measuring conditions outside of the processor or processor module, and can be utilized to further adjust the optimal frequency value that is determined based on the effective switching capacitance by, for example, correlating one or more ambient conditions parameters to a predetermined power credit. In some examples, the microcontroller 601 determines 902 an ambient conditions index based on one or more ambient conditions parameters such as ambient temperature and/or altitude. The ambient conditions index adjusts for the ambient room temperature and altitude to give a thermal cooling credit to the expected TDP definition. For example, for each degree Celsius that the ambient room temperature is below a reference value used for the TDP definition, N watts of power credit may be awarded. This temperature component is dependent on sensing at the system air-intake, not within the processor, which provides deterministic benefit (i.e., not sensitive to manufacturing variations). For example, an ambient temperature sensor may be an off-die sensor, such as a sensor that is coupled to the processor module or another component in the processor chassis, or the ambient temperature may be relayed by the operating system. The altitude component stems from higher density air improving the effectiveness of heatsink cooling at altitudes less than 1000 meters. In some examples, the ambient conditions index is a secondary index that is combined with the effective switching capacitance ratio (the primary index) to identify an optimal frequency value from a frequency scaling table. The frequency scaling table may be created by simulating all combinations of effective switching capacitance ratio and ambient conditions. These two indexes may also be combined with additional indexes and combinations of indexes to index into the frequency scaling table.
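The temperature component of the credit can be sketched as a simple linear function. The function name, the per-degree wattage passed in, and the choice to return a negative credit when the intake air is warmer than the reference are assumptions; the altitude component is not modeled here.

```python
def ambient_power_credit(ambient_c, reference_c, watts_per_degree):
    """Power credit (watts) from the intake air temperature.

    Each degree Celsius below the TDP reference temperature earns
    watts_per_degree of credit. A warmer intake yields a negative
    credit in this sketch (an assumption, not stated in the text).
    """
    return (reference_c - ambient_c) * watts_per_degree
```

Because the input is the system air-intake temperature rather than an on-die reading, two parts in the same environment receive the same credit in this sketch.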

For further explanation, FIG. 10 sets forth a flow chart illustrating another example method of providing deterministic frequency and voltage enhancements for a processor in accordance with some embodiments of the present disclosure. Like the example method of FIG. 6, the example method of FIG. 10 also includes identifying 602 a plurality of parameters 605 related to a processor 603, the plurality of parameters 605 including at least a current supplied to the processor; determining 604, in dependence upon the plurality of parameters 605, one or more frequency scaling indexes 607 including determining an effective switching capacitance ratio; identifying 606, in dependence upon the one or more frequency scaling indexes 607, a predetermined frequency parameter 609 for the processor 603; and transitioning 608, based on the frequency parameter 609, the processor 603 to a target clock frequency.

In the example method of FIG. 10, transitioning 608, based on the frequency parameter 609, the processor 603 to a target clock frequency further includes setting 1002, based on the target clock frequency, a target power supply voltage for the processor. Once a new target frequency is determined, the microcontroller 601 also adjusts the VDD of the processor 603 based on the new clock frequency. In some examples, the microcontroller 601 determines a new VDD for the processor based on reference data (e.g., MVPD) generated during processor characterization. For example, the reference data may map frequency values to a VDD for the processor. In some examples, the microcontroller 601 sets the power supply voltage for the processor 603 by indicating the new power supply voltage to a voltage regulator module. For example, the microcontroller 601 may write a new VDD value to a register used by the voltage regulator module to control VDD for the processor 603.

For further explanation, FIG. 11 sets forth a flow chart illustrating another example method of providing deterministic frequency and voltage enhancements for a processor in accordance with some embodiments of the present disclosure. Like the example method of FIG. 6, the example method of FIG. 11 also includes identifying 602 a plurality of parameters 605 related to a processor 603, the plurality of parameters 605 including at least a current supplied to the processor; determining 604, in dependence upon the plurality of parameters 605, one or more frequency scaling indexes 607 including determining an effective switching capacitance ratio; identifying 606, in dependence upon the one or more frequency scaling indexes 607, a predetermined frequency parameter 609 for the processor 603; and transitioning 608, based on the frequency parameter 609, the processor 603 to a target clock frequency.

As discussed above, an undervolt control loop decreases processor voltage to the lowest value that does not impact processor performance. Accordingly, the example method of FIG. 11 further includes decreasing 1102, incrementally, the power supply voltage for the processor 603. In some examples, the microcontroller 601 decreases 1102, incrementally, the power supply voltage for the processor 603 by decreasing the VDD for the processor by V volts every T microseconds (e.g., 125 microseconds). For example, VDD may decrease during a processor idle state that is determined by monitoring the outputs of digital droop sensors. In some examples, the microcontroller 601 decreases the power supply voltage by indicating a new power supply voltage to a voltage regulator module. For example, the microcontroller 601 may write a new VDD value to a register used by the voltage regulator module to control VDD for the processor 603.

As discussed above, digital droop sensors (e.g., digital droop sensors 532 in FIG. 5) output a value that indicates a voltage droop (i.e., a difference between a supplied voltage VDD and the voltage in the core). Accordingly, the example method of FIG. 11 also includes determining 1104 that a voltage droop parameter exceeds a voltage droop parameter threshold. For example, a droop counter may be updated each time a particular droop sensor outputs, per cycle, a value that is below a particular level. In some examples, the microcontroller 601 determines 1104 that the voltage droop parameter exceeds a voltage droop parameter threshold by monitoring the droop rate (i.e., the number of cycles per window of cycles that an undervolting droop is detected) that is based on the outputs of digital droop sensors and comparing the droop rate to a programmable threshold. In some examples, the format output by the digital droop sensors is thermometer code, and a voltage droop counter is updated each time the thermometer code value is below a particular undervolting level. This undervolting level may be higher than a throttling level that triggers voltage droop mitigation. In some examples, the droop rate threshold for undervolting may be updated by the microcontroller 601 in response to a target frequency transition initiated as part of the WOF control loop discussed above.

In some cases, a droop counter may be incremented for every cycle that the droop-mitigation action is actuated.

The example method of FIG. 11 also includes increasing 1106, incrementally, the power supply voltage in response to determining that the voltage droop parameter exceeds a voltage droop parameter threshold. In some examples, the microcontroller 601 increases 1106 the power supply voltage incrementally by increasing VDD incrementally in response to determining that the droop rate (e.g., the number of cycles per window of cycles that an undervolting droop is identified) is higher than a predetermined droop rate threshold. In some examples, the microcontroller 601 increases the power supply voltage by indicating a new power supply voltage to a voltage regulator module. For example, the microcontroller 601 may write a new VDD value to a register used by the voltage regulator module to control VDD for the processor 603.
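The decrease/compare/increase steps of FIG. 11 can be illustrated with a minimal sketch. This is not the disclosed firmware: the class name `UndervoltLoop`, the millivolt units, the clamping limits, and the choice to step down whenever the droop rate does not exceed the threshold are all assumptions made for illustration. In a real system the returned setpoint would be written to the voltage regulator module register once per interval T.

```python
from dataclasses import dataclass


@dataclass
class UndervoltLoop:
    """Hypothetical model of one undervolt control loop (all names are illustrative)."""
    vdd_mv: int                  # current VDD setpoint in millivolts
    vdd_min_mv: int              # assumed lower clamp for the setpoint
    vdd_max_mv: int              # assumed upper clamp (e.g., VDD_MAX)
    step_mv: int                 # the "V volts" step, expressed in millivolts
    droop_rate_threshold: float  # programmable droop-rate threshold

    def update(self, droop_cycles: int, window_cycles: int) -> int:
        """Run one interval: compare the droop rate to the threshold and step VDD."""
        droop_rate = droop_cycles / window_cycles
        if droop_rate > self.droop_rate_threshold:
            # Droop parameter exceeds the threshold: raise VDD incrementally.
            self.vdd_mv = min(self.vdd_mv + self.step_mv, self.vdd_max_mv)
        else:
            # Otherwise keep lowering VDD incrementally.
            self.vdd_mv = max(self.vdd_mv - self.step_mv, self.vdd_min_mv)
        return self.vdd_mv
```

For instance, starting at 1000 mV with a 5 mV step and a 1% threshold, two quiet windows lower the setpoint to 990 mV, and a window with 50 droop cycles out of 1000 raises it back to 995 mV.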

In some embodiments, the voltage droop parameter is based on droop mitigation actions, i.e., an event where a detected voltage droop triggers a droop mitigation action such as core throttling as discussed below. Thus, the voltage droop parameter may be a number of droop mitigation actions or a rate of droop mitigation actions, and the voltage droop parameter threshold may be a threshold number or rate of droop mitigation actions. In some examples, when droop mitigation is enabled, the voltage is increased when the number or rate of droop mitigation actions exceeds the threshold number or rate. Alternatively, the voltage can be increased when the number or fraction of cycles that the droop mitigation action is active exceeds a threshold. The voltage is decreased when the number or rate of mitigation actions is below a threshold number or rate. Alternatively, the voltage can be decreased when the number or fraction of cycles that the droop mitigation action is active is below a threshold.

In some variations, a size of a power supply voltage increment may be dynamically selected based on the voltage droop parameter. Likewise, a size of a power supply voltage decrement may be dynamically selected based on the voltage droop parameter. For example, the size of the voltage increment or decrement may be chosen based on the number or rate of droop mitigation actions/events or the number or rate of cycles with droop-mitigation active.
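As one hedged illustration of dynamically sized steps, the sketch below scales the increment with how far the droop count exceeds the threshold and scales the decrement with the remaining headroom. The disclosure does not specify any particular mapping; the function name `select_step_mv`, the proportional rule, the default step sizes, and the 1 mV minimum decrement are assumptions.

```python
def select_step_mv(droop_cycles: int, threshold_cycles: int,
                   base_step_mv: int = 5, max_step_mv: int = 20) -> int:
    """Return a signed VDD step in millivolts (positive raises VDD, negative lowers it).

    Hypothetical policy: integer counts per window are used to avoid
    floating-point rounding in the comparison.
    """
    if threshold_cycles <= 0:
        raise ValueError("threshold_cycles must be positive")
    if droop_cycles > threshold_cycles:
        # Over threshold: larger excess gives a larger increment, up to a cap.
        multiples = droop_cycles // threshold_cycles
        return min(base_step_mv * multiples, max_step_mv)
    # At or under threshold: more headroom gives a larger decrement (at least 1 mV).
    headroom_cycles = threshold_cycles - droop_cycles
    decrement = (base_step_mv * headroom_cycles) // threshold_cycles
    return -max(decrement, 1)
```

With a threshold of 10 droop cycles per window, a window with no droops yields the full 5 mV decrement, while a window with 35 droop cycles yields a 15 mV increment.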

For further explanation, FIG. 12 sets forth a flow chart illustrating another example method of providing deterministic frequency and voltage enhancements for a processor in accordance with some embodiments of the present disclosure. Like the example method of FIG. 6, the example method of FIG. 12 also includes identifying 602 a plurality of parameters 605 related to a processor 603, the plurality of parameters 605 including at least a current supplied to the processor; determining 604, in dependence upon the plurality of parameters 605, one or more frequency scaling indexes 607 including determining an effective switching capacitance ratio; identifying 606, in dependence upon the one or more frequency scaling indexes 607, a predetermined frequency parameter 609 for the processor 603; and transitioning 608, based on the frequency parameter 609, the processor 603 to a target clock frequency.

As discussed above, a digital droop sensor mitigates core voltage droops by throttling instruction processing rates in core regions where the droop is detected. A voltage droop may occur when a new load is created during a transition of an idle state to a high intensity workload. Accordingly, the example method of FIG. 12 further includes detecting 1202 a voltage droop based on a core voltage falling below a core voltage threshold. In some examples, a digital droop sensor 1201 of the processor 603 detects 1202 a voltage droop based on a core voltage falling below a core voltage threshold by detecting a core voltage in a region where the digital droop sensor 1201 is located and comparing the detected core voltage to a programmable core voltage threshold. In some implementations, detecting 1202 a voltage droop based on a core voltage falling below a core voltage threshold is carried out by a digital droop sensor 1201 of the processor 603 registering a thermometer code bin based on a detected core voltage. If the thermometer code bin falls below a programmable thermometer code bin threshold, a voltage droop is detected.
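The thermometer-code comparison can be sketched as follows. This is an illustrative model only: the helper names, the code width, and the convention that a higher core voltage asserts more bits are assumptions not stated in the disclosure.

```python
def thermometer_bin(code: int, width: int) -> int:
    """Return the bin of a width-bit thermometer code as its count of asserted bits.

    Assumed convention: a higher core voltage asserts more bits,
    so 0b00111 (three bits set) maps to bin 3.
    """
    mask = (1 << width) - 1
    return bin(code & mask).count("1")


def droop_detected(code: int, width: int, bin_threshold: int) -> bool:
    """Report a droop when the registered bin falls below the programmable threshold."""
    return thermometer_bin(code, width) < bin_threshold
```

Under these assumptions, a 5-bit reading of `0b00111` with a bin threshold of 4 is reported as a droop, while `0b01111` is not.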

The example method of FIG. 12 also includes, in response to detecting the voltage droop, throttling 1204 one or more regions of the core. In some examples, the digital droop sensor 1201 throttles 1204 one or more regions of the core by reducing an instruction processing rate in those regions. For example, the instruction processing rate may be reduced by reducing the local clock that is supplied to those regions or reducing latch rates. In some examples, the digital droop sensor utilizes a throttle sequencing engine to dispatch throttle commands to local core throttle controls that reduce throughput.

The example method of FIG. 12 also includes decreasing 1206, incrementally, an amount of throttling based on an increase in core voltage. As discussed above, discontinuing core throttling when core voltage rises above the core voltage threshold could trigger another voltage droop. Accordingly, the digital droop sensor 1201 decreases 1206 an amount of throttling based on an increase in core voltage by gradually decreasing the amount of throttling that is applied to the core regions while the core voltage adjusts to the new load created by a transition of an idle state to a high intensity workload. For example, the digital droop sensor may decrease throttling by an incremental amount such that the core voltage is allowed to adjust to the increase in load caused by the decrease in throttling before throttling is further decreased. In some examples, the core voltage threshold is adjusted dynamically in response to the WOF controller 152 transitioning the processor to a target clock frequency.
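The throttle-then-release behavior can be modeled with a small state machine. The sketch below is an assumption-laden illustration rather than the throttle sequencing engine itself: the class name `ThrottleController`, the discrete throttle levels, and the one-level-per-interval release are hypothetical.

```python
class ThrottleController:
    """Hypothetical throttle state: 0 means no throttling, max_level means full throttling."""

    def __init__(self, max_level: int = 4) -> None:
        self.max_level = max_level
        self.level = 0

    def step(self, droop: bool) -> int:
        """Advance one settle interval and return the throttle level to apply."""
        if droop:
            # A detected droop applies full throttling immediately.
            self.level = self.max_level
        elif self.level > 0:
            # No droop: release one level, then wait for the core voltage
            # to settle before releasing further.
            self.level -= 1
        return self.level
```

Feeding the controller the droop sequence `[True, False, False, True, False]` produces throttle levels `[4, 3, 2, 4, 3]`: throttling is released stepwise and re-applied in full if a new droop occurs during release.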

For further explanation, FIG. 13 sets forth a flow chart illustrating another example method of providing deterministic frequency and voltage enhancements for a processor in accordance with some embodiments of the present disclosure. The example method of FIG. 13 includes performing 1302 one or more droop mitigation actions. In some examples, performing 1302 one or more droop mitigation actions may be carried out by detecting 1202 a voltage droop based on a core voltage falling below a core voltage threshold, throttling 1204 one or more regions of the core in response to detecting the voltage droop, and decreasing 1206, incrementally, an amount of throttling based on an increase in core voltage, as discussed above.

The example method of FIG. 13 also includes executing 1304 a voltage enhancement control loop in dependence upon the one or more droop mitigation actions. In some examples, executing 1304 the voltage enhancement control loop in dependence upon the one or more droop mitigation actions may be carried out by decreasing 1102, incrementally, the power supply voltage for the processor 603, determining 1104 that a voltage droop parameter exceeds a voltage droop parameter threshold, and increasing 1106, incrementally, the power supply voltage in response to determining that the voltage droop parameter exceeds a voltage droop parameter threshold, as discussed above. In some examples, the voltage droop parameter is based on the number or rate of droop mitigation actions, or the number of cycles or percentage of cycles that a droop mitigation action is active.

As discussed above, some embodiments use a sensor to detect voltage droops based on a core voltage falling below a core voltage threshold. When droops are detected, the effect of the droop is then mitigated by either throttling one or more regions of the core in response to detecting the voltage droop, or reducing the core clock frequency, to prevent functional errors. When robust droop mitigation is employed, the voltage can be reduced far below the safe minimum voltage without droop mitigation. When a droop is detected, the droop mitigation action (either throttling or frequency reduction) may result in a small reduction in performance for the clock cycles when the mitigation action is occurring. As the voltage is reduced, the fraction of clock cycles with reduced performance increases, eventually resulting in measurable performance loss at low voltage.

In some variations, the sensor that is used for voltage control is offset or calibrated differently than the voltage sensor used for droop mitigation. For example, the sensor for voltage control is set to a higher voltage than the sensor used to trigger the droop mitigation action. In this case, the voltage can be controlled to higher values, to reduce the performance-loss from excessive droop-mitigation actions to a lower level than is possible when exactly the same sensor-threshold is used for both droop mitigation and voltage control.
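The effect of using two differently set thresholds can be shown with a short sketch. The function name `classify_sample` and the use of thermometer-code bins as the compared quantity are assumptions for illustration; the disclosure only states that the voltage-control level may be set higher than the mitigation level.

```python
from typing import Tuple


def classify_sample(bin_value: int, throttle_bin: int, undervolt_bin: int) -> Tuple[bool, bool]:
    """Return (count_for_voltage_control, trigger_mitigation) for one sensor sample.

    Hypothetical model: undervolt_bin is the higher level used by the voltage
    control loop, and throttle_bin is the lower level that triggers mitigation.
    """
    if undervolt_bin < throttle_bin:
        raise ValueError("undervolt_bin is expected to be at or above throttle_bin")
    count_for_voltage_control = bin_value < undervolt_bin
    trigger_mitigation = bin_value < throttle_bin
    return count_for_voltage_control, trigger_mitigation
```

With `throttle_bin=3` and `undervolt_bin=5`, a sample in bin 4 is counted by the voltage control loop without triggering throttling, so the loop can raise VDD before mitigation actions begin to cost performance.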

In view of the foregoing, it will be appreciated that providing deterministic frequency and voltage enhancements for a processor in accordance with the present disclosure provides many advantages, including but not limited to: a) a workload optimized frequency controller that dynamically monitors the processor and system conditions and deterministically sets the processor frequency for maximum performance; b) robust droop mitigation using digital droop sensors combined with core throttling that reduces the voltage guardband and increases maximum frequency; and c) an undervolt voltage control loop that dynamically reduces VDD below VDD_(MAX) based on feedback from digital droop sensors, providing a boost to maximum frequency.

The present disclosure may be a system, a method, or a computer program product. The computer program product may include a computer readable storage medium (or media) having computer readable program instructions thereon for causing a processor to carry out aspects of the present disclosure.

The computer readable storage medium can be a tangible device that can retain and store instructions for use by an instruction execution device. The computer readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. A non-exhaustive list of more specific examples of the computer readable storage medium includes the following: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disk (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punch-cards or raised structures in a groove having instructions recorded thereon, and any suitable combination of the foregoing. A computer readable storage medium, as used herein, is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (e.g., light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.

Computer readable program instructions described herein can be downloaded to respective computing/processing devices from a computer readable storage medium or to an external computer or external storage device via a network, for example, the Internet, a local area network, a wide area network and a wireless network. The network may comprise copper transmission cables, optical transmission fibers, wireless transmission, routers, firewalls, switches, gateway computers and edge servers. A network adapter card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium within the respective computing/processing device.

Computer readable program instructions for carrying out operations of the present disclosure may be assembler instructions, instruction-set-architecture (ISA) instructions, machine instructions, machine dependent instructions, microcode, firmware instructions, state-setting data, or either source code or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk, C++ or the like, and conventional procedural programming languages, such as the “C” programming language or similar programming languages. The computer readable program instructions may execute entirely on the user’s computer, partly on the user’s computer, as a stand-alone software package, partly on the user’s computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user’s computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider). In some embodiments, electronic circuitry including, for example, programmable logic circuitry, field-programmable gate arrays (FPGA), or programmable logic arrays (PLA) may execute the computer readable program instructions by utilizing state information of the computer readable program instructions to personalize the electronic circuitry, in order to perform aspects of the present disclosure.

Aspects of the present disclosure are described herein with reference to flowchart illustrations and block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the disclosure. It will be understood that each block of the flowchart illustrations and block diagrams, and combinations of blocks in the flowchart illustrations and block diagrams, can be implemented by computer readable program instructions.

These computer readable program instructions may be provided to a processor of a general-purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowcharts and block diagram block or blocks. These computer readable program instructions may also be stored in a computer readable storage medium that can direct a computer, a programmable data processing apparatus, and other devices to function in a particular manner, such that the computer readable storage medium having instructions stored therein comprises an article of manufacture including instructions which implement aspects of the function/act specified in the flowcharts and block diagram block or blocks.

The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other device to cause a series of operational steps to be performed on the computer, other programmable apparatus, or other device to produce a computer implemented process, such that the instructions which execute on the computer, other programmable apparatus, or other device implement the functions/acts specified in the flowcharts and block diagram block or blocks.

The flowcharts and block diagrams in the Figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowcharts or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and flowchart illustrations, and combinations of blocks in the block diagrams and flowchart illustrations, can be implemented by special purpose hardware-based systems that perform the specified functions or acts or carry out combinations of special purpose hardware and computer instructions.

It will be understood from the foregoing description that modifications and changes may be made in various embodiments of the present invention without departing from its true spirit. The descriptions in this specification are for purposes of illustration only and are not to be construed in a limiting sense. The scope of the present invention is limited only by the language of the following claims.

What is claimed is:
 1. A method of providing deterministic frequency and voltage enhancements on a processor, the method comprising: identifying a plurality of parameters related to a processor, the plurality of parameters including at least a current supplied to the processor; determining, in dependence upon the plurality of parameters, one or more frequency scaling indexes including determining an effective switching capacitance ratio; identifying, in dependence upon the one or more frequency scaling indexes, a predetermined frequency parameter for the processor; and transitioning, based on the frequency parameter, the processor to a target clock frequency.
 2. The method of claim 1, wherein the plurality of parameters further includes one or more of: an ambient temperature, an altitude, one or more input/output (I/O) configuration parameters, one or more core power states, one or more core clock states, an average voltage, and an average frequency.
 3. The method of claim 1, wherein the predetermined frequency parameter is identified from a table that maps the one or more frequency scaling indexes to the predetermined frequency parameter.
 4. The method of claim 1, wherein determining, based on the plurality of parameters, one or more frequency scaling indexes includes determining a core activity state index.
 5. The method of claim 1, wherein determining, based on the plurality of parameters, one or more frequency scaling indexes includes determining an input/output power index.
 6. The method of claim 1, wherein determining, based on the plurality of parameters, one or more frequency scaling indexes includes determining an ambient conditions index.
 7. The method of claim 1, wherein transitioning, based on the target clock frequency, the processor to a target clock frequency includes setting, based on the target clock frequency, a target power supply voltage for the processor.
 8. The method of claim 7 further comprising: decreasing, incrementally, the power supply voltage for the processor; determining that a voltage droop parameter exceeds a voltage droop parameter threshold; and increasing, incrementally, the power supply voltage in response to determining that the voltage droop parameter exceeds a voltage droop parameter threshold.
 9. The method of claim 1 further comprising: detecting a voltage droop based on a core voltage falling below a core voltage threshold; in response to detecting the voltage droop, throttling one or more regions of the core; and decreasing, incrementally, an amount of throttling based on an increase in core voltage.
 10. The method of claim 9, wherein the core voltage threshold is adjusted dynamically in response to transitioning, based on the frequency parameter, the processor to a target clock frequency.
 11. An apparatus comprising: a processor; and a memory storing instructions that, when executed by the processor, configure the apparatus to: identify a plurality of parameters related to a processor, the plurality of parameters including at least a current supplied to the processor; determine, in dependence upon the plurality of parameters, one or more frequency scaling indexes including determining an effective switching capacitance ratio; identify, in dependence upon the one or more frequency scaling indexes, a predetermined frequency parameter for the processor; and transition, based on the frequency parameter, the processor to a target clock frequency.
 12. The apparatus of claim 11, wherein the predetermined frequency parameter is identified from a table that maps the one or more frequency scaling indexes to the predetermined frequency parameter.
 13. The apparatus of claim 11, wherein determining, based on the plurality of parameters, one or more frequency scaling indexes includes determining a core activity state index.
 14. The apparatus of claim 11, wherein determining, based on the plurality of parameters, one or more frequency scaling indexes includes determining an input/output power index.
 15. The apparatus of claim 11, wherein determining, based on the plurality of parameters, one or more frequency scaling indexes includes determining an ambient conditions index.
 16. A method of providing deterministic frequency and voltage enhancements on a processor, the method comprising: decreasing, incrementally, a power supply voltage for a processor; determining that a voltage droop parameter exceeds a voltage droop parameter threshold; and increasing, incrementally, the power supply voltage in response to determining that the voltage droop parameter exceeds a voltage droop parameter threshold.
 17. The method of claim 16 further comprising: detecting a voltage droop based on a core voltage falling below a core voltage threshold; in response to detecting the voltage droop, throttling one or more regions of the core; and decreasing, incrementally, an amount of throttling based on an increase in core voltage.
 18. The method of claim 17, wherein the core voltage threshold is adjusted dynamically in response to transitioning, based on the frequency parameter, the processor to a target clock frequency.
 19. The method of claim 16, wherein the voltage droop parameter is at least one of a number of voltage droop events, a rate of voltage droop events, a number of cycles that a droop mitigation action is active, and a fraction of cycles that the droop mitigation action is active.
 20. The method of claim 16, wherein a size of a power supply voltage increment is dynamically selected based on the voltage droop parameter; and wherein a size of a power supply voltage decrement is dynamically selected based on the voltage droop parameter.